{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--BOOK_INFORMATION-->\n",
    "<a href=\"https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv\" target=\"_blank\"><img align=\"left\" src=\"data/cover.jpg\" style=\"width: 76px; height: 100px; background: white; padding: 1px; border: 1px solid black; margin-right:10px;\"></a>\n",
    "*This notebook contains an excerpt from the book [Machine Learning for OpenCV](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv) by Michael Beyeler.\n",
    "The code is released under the [MIT license](https://opensource.org/licenses/MIT),\n",
    "and is available on [GitHub](https://github.com/mbeyeler/opencv-machine-learning).*\n",
    "\n",
    "*Note that this excerpt contains only the raw code - the book is rich with additional explanations and illustrations.\n",
    "If you find this content useful, please consider supporting the work by\n",
    "[buying the book](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv)!*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Training an MLP in OpenCV to Classify Handwritten Digits](09.04-Training-an-MLP-in-OpenCV-to-Classify-Handwritten-Digits.ipynb) | [Contents](../README.md) | [Combining Different Algorithms Into an Ensemble](10.00-Combining-Different-Algorithms-Into-an-Ensemble.ipynb) >"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training a Deep Neural Net to Classify Handwritten Digits Using Keras\n",
    "\n",
    "Although we achieved a formidable score with the MLP above, our result does not hold up\n",
    "to state-of-the-art results. Currently the best result has close to 99.8% accuracy—better than\n",
    "human performance! This is why nowadays, the task of classifying handwritten digits is\n",
    "largely regarded as solved.\n",
    "\n",
    "To get closer to the state-of-the-art results, we need to use state-of-the-art techniques. Thus,\n",
    "we return to Keras."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preprocessing the MNIST dataset\n",
    "\n",
    "To make sure we get the same result every time we run the experiment, we will pick a\n",
    "random seed for NumPy's random number generator. This way, shuffling the training\n",
    "samples from the MNIST dataset will always result in the same order:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "np.random.seed(1337)  # for reproducibility"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keras provides a loading function similar to train_test_split from scikit-learn's\n",
    "`model_selection` module. Its syntax might look strangely familiar to you:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using Theano backend.\n"
     ]
    }
   ],
   "source": [
    "from keras.datasets import mnist\n",
    "(X_train, y_train), (X_test, y_test) = mnist.load_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The neural nets in Keras act on the feature matrix slightly differently than the standard\n",
    "OpenCV and scikit-learn estimators. Whereas the rows of a feature matrix in Keras still\n",
    "correspond to the number of samples (`X_train.shape[0]` in the code below), we can\n",
    "preserve the two-dimensional nature of the input images by adding more dimensions to the\n",
    "feature matrix:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "img_rows, img_cols = 28, 28\n",
    "X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)\n",
    "X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)\n",
    "input_shape = (img_rows, img_cols, 1)\n",
    "    \n",
    "X_train = X_train.astype('float32') / 255.0\n",
    "X_test = X_test.astype('float32') / 255.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we have reshaped the feature matrix into a four-dimensional matrix with dimensions\n",
    "`n_features x 28 x 28 x 1`.\n",
    "We also made sure we operate on 32-bit floating\n",
    "point numbers between [0, 1], rather than unsigned integers in [0, 255]."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, we can one-hot encode the training labels like we did before. This will make sure each\n",
    "category of target labels can be assigned to a neuron in the output layer. We could do this\n",
    "with scikit-learn's `preprocessing`, but in this case it is easier to use Keras' own utility\n",
    "function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from keras.utils import np_utils\n",
    "n_classes = 10\n",
    "Y_train = np_utils.to_categorical(y_train, n_classes)\n",
    "Y_test = np_utils.to_categorical(y_test, n_classes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating a convolutional neural network\n",
    "\n",
    "Once we have preprocessed the data, it is time to define the actual model. Here, we will\n",
    "once again rely on the `Sequential` model to define a feedforward neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from keras.models import Sequential\n",
    "model = Sequential()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, this time, we will be smarter about the individual layers. We will design our\n",
    "neural network around a **convolutional layer**, where the kernel is a 3 x 3 pixel two-dimensional\n",
    "convolution.\n",
    "\n",
    "A two-dimensional convolutional layer operates akin to image filtering in\n",
    "OpenCV, where each image in the input data is convolved with a small\n",
    "two-dimensional kernel. In Keras, we can specify the kernel size and the\n",
    "stride:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from keras.layers import Conv2D\n",
    "n_filters = 32\n",
    "kernel_size = (3, 3)\n",
    "model.add(Conv2D(n_filters, (kernel_size[0], kernel_size[1]),\n",
    "                 padding='valid',\n",
    "                 input_shape=input_shape))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After that, we will use a linear rectified unit as an activation function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from keras.layers import Activation\n",
    "model.add(Activation('relu'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In a deep convolutional neural net, we can have as many layers as we want. A popular\n",
    "version of this structure applied to MNIST involves performing the convolution and\n",
    "rectification twice:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.add(Conv2D(n_filters, (kernel_size[0], kernel_size[1])))\n",
    "model.add(Activation('relu'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we will pool the activations and add a `Dropout` layer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from keras.layers import MaxPooling2D, Dropout\n",
    "pool_size = (2, 2)\n",
    "model.add(MaxPooling2D(pool_size=pool_size))\n",
    "model.add(Dropout(0.25))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we will flatten the model and finally pass it through a `softmax` function to arrive at\n",
    "the output layer:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from keras.layers import Flatten, Dense\n",
    "model.add(Flatten())\n",
    "model.add(Dense(128))\n",
    "model.add(Activation('relu'))\n",
    "model.add(Dropout(0.5))\n",
    "model.add(Dense(n_classes))\n",
    "model.add(Activation('softmax'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we will use the cross-entropy loss and the Adadelta algorithm:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "model.compile(loss='categorical_crossentropy',\n",
    "              optimizer='adadelta',\n",
    "              metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fitting the model\n",
    "\n",
    "We fit the model like we do with all other classifiers:\n",
    "\n",
    "> Caution! This might take several hours depending on your machine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train on 60000 samples, validate on 10000 samples\n",
      "Epoch 1/12\n",
      "60000/60000 [==============================] - 88s - loss: 0.3501 - acc: 0.8929 - val_loss: 0.0887 - val_acc: 0.9714\n",
      "Epoch 2/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.1263 - acc: 0.9618 - val_loss: 0.0596 - val_acc: 0.9803\n",
      "Epoch 3/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0999 - acc: 0.9705 - val_loss: 0.0475 - val_acc: 0.9844\n",
      "Epoch 4/12\n",
      "60000/60000 [==============================] - 88s - loss: 0.0839 - acc: 0.9752 - val_loss: 0.0416 - val_acc: 0.9861\n",
      "Epoch 5/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0742 - acc: 0.9780 - val_loss: 0.0384 - val_acc: 0.9876\n",
      "Epoch 6/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0661 - acc: 0.9802 - val_loss: 0.0366 - val_acc: 0.9876\n",
      "Epoch 7/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0603 - acc: 0.9814 - val_loss: 0.0378 - val_acc: 0.9869\n",
      "Epoch 8/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0580 - acc: 0.9825 - val_loss: 0.0351 - val_acc: 0.9880\n",
      "Epoch 9/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0535 - acc: 0.9830 - val_loss: 0.0325 - val_acc: 0.9881\n",
      "Epoch 10/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0518 - acc: 0.9845 - val_loss: 0.0305 - val_acc: 0.9893\n",
      "Epoch 11/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0490 - acc: 0.9854 - val_loss: 0.0308 - val_acc: 0.9899\n",
      "Epoch 12/12\n",
      "60000/60000 [==============================] - 87s - loss: 0.0459 - acc: 0.9862 - val_loss: 0.0301 - val_acc: 0.9898\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<keras.callbacks.History at 0x7f28d9adc390>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.fit(X_train, Y_train, batch_size=128, epochs=12,\n",
    "          verbose=1, validation_data=(X_test, Y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After training completes, we can evaluate the classifier:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " 9920/10000 [============================>.] - ETA: 0s"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[0.030051206669808015, 0.98980000000000001]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.evaluate(X_test, Y_test, verbose=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we achieve 99.25% accuracy! Worlds apart from the MLP classifier we implemented\n",
    "before. And this is just one way to do things. As you can see, neural networks provide a\n",
    "plethora of tuning parameters, and it is not at all clear which ones will lead to the best\n",
    "performance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Training an MLP in OpenCV to Classify Handwritten Digits](09.04-Training-an-MLP-in-OpenCV-to-Classify-Handwritten-Digits.ipynb) | [Contents](../README.md) | [Combining Different Algorithms Into an Ensemble](10.00-Combining-Different-Algorithms-Into-an-Ensemble.ipynb) >"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
